AD-A281 219

Form Approved

OMB No. 0704-0188

1. AGENCY USE ONLY   2. REPORT DATE   3. REPORT TYPE AND DATES COVERED

5/19/94   Final Technical, 01 Apr 92 - 31 Mar 94

4.  TITLE  AND  SUBTITLE 

Stochastic  Control  and  Nonlinear  Estimation 

5. FUNDING NUMBERS

F49620-92-J-0081 

6.  AUTHOR(S) 

Wendell  H.  Fleming  &  Harold  J.  Kushner 

7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

Division of Applied Mathematics

Brown University

182 George Street

Providence, RI 02912

8. PERFORMING ORGANIZATION REPORT NUMBER

AFOSR-TR-94-0393

9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)

AFOSR/NM 

110  Duncan  Ave.,  Suite  B115 
Bolling  AFB,  DC  20332-6448 


11. SUPPLEMENTARY NOTES


94-20716 


12a. DISTRIBUTION/AVAILABILITY STATEMENT


12b. DISTRIBUTION CODE


APPROVED  FOR  PUBLIC  RELEASE: 
DISTRIBUTION  UNLIMITED 


13.  ABSTRACT  (Maximum  200  words) 

W. H. Fleming's work during this period concerned risk sensitive stochastic control
and related questions about differential games. This theory provides a link between
stochastic and deterministic (robust control) approaches to disturbance attenuation
problems. H. J. Kushner's work developed efficient, general stochastic approximation
methods for improving the operation of continuous or discrete event dynamical systems
over a long time period. Applications to communication problems include large
controlled multiplexing systems, which are approximated by diffusion type processes.
The method yields a very efficient way of approximation as well as good numerical
methods.


14.  SUBJECT  TERMS 

15. NUMBER OF PAGES

4

16. PRICE CODE

17. SECURITY CLASSIFICATION OF REPORT

18. SECURITY CLASSIFICATION OF THIS PAGE

19. SECURITY CLASSIFICATION OF ABSTRACT

20. LIMITATION OF ABSTRACT

NSN 7540-01-280-5500


Standard Form 298 (Rev. 2-89)

Prescribed by ANSI Std. Z39-18


Final  Technical  Report 


This is a summary of research by W. H. Fleming and H. J. Kushner done
under AFOSR Grant F49620-92-J-0081, “Stochastic Control and Nonlinear
Estimation”.

W. H. Fleming worked on risk sensitive control together with related
questions in robust nonlinear control and differential games. Risk sensitive
control theory provides a link between deterministic and stochastic
approaches to disturbance attenuation problems. It uses ideas from the theory
of large deviations for stochastic processes.

In work with W. M. McEneaney, risk sensitive control problems were
considered on both finite and infinite time horizons for nonlinear systems
described by stochastic differential equations. Logarithmic transformations
were applied to the associated optimal cost functions for finite-time horizon
problems. The value function for a zero-sum, two-controller differential
game was obtained in the limit as a small parameter (which represents noise
intensity) tends to zero. Convergence to the value function of the
differential game was proved by viscosity solution methods for nonlinear
partial differential equations.
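
The role of the logarithmic transformation can be illustrated with a short
formal calculation. The normalization below (unit noise matrix, noise
intensity \varepsilon, running cost \ell, drift f, terminal cost zero) is an
assumption chosen for illustration and is not taken from the report; the
symbols \phi^\varepsilon, V^\varepsilon, u and w are likewise hypothetical
names.

```latex
% Assumed model: dynamics with small noise and an exponential-of-integral cost.
\begin{align*}
dx_s &= f(x_s,u_s)\,ds + \sqrt{\varepsilon}\,dW_s, \\
\phi^\varepsilon(x,t) &= \inf_{u}\, E_{x,t}
   \exp\Big(\frac{1}{\varepsilon}\int_t^T \ell(x_s,u_s)\,ds\Big).
\end{align*}
% Formal dynamic programming equation for \phi^\varepsilon.
\begin{equation*}
\phi^\varepsilon_t + \frac{\varepsilon}{2}\Delta\phi^\varepsilon
 + \min_u\Big[f\cdot\nabla\phi^\varepsilon
 + \frac{1}{\varepsilon}\,\ell\,\phi^\varepsilon\Big] = 0,
 \qquad \phi^\varepsilon(x,T) = 1.
\end{equation*}
% Logarithmic transformation V^\varepsilon = \varepsilon\log\phi^\varepsilon.
\begin{equation*}
V^\varepsilon_t + \frac{\varepsilon}{2}\Delta V^\varepsilon
 + \frac{1}{2}\,|\nabla V^\varepsilon|^2
 + \min_u\big[f\cdot\nabla V^\varepsilon + \ell\big] = 0,
 \qquad V^\varepsilon(x,T) = 0.
\end{equation*}
% The quadratic term is a maximum over a fictitious disturbance w.
\begin{equation*}
\frac{1}{2}\,|p|^2 = \max_{w}\Big[p\cdot w - \frac{1}{2}\,|w|^2\Big].
\end{equation*}
```

Under these assumptions the transformed equation is the Isaacs equation of
a game with dynamics dx = (f + w) ds + sqrt(epsilon) dW, in which u minimizes
and w maximizes the payoff given by the integral of \ell - |w|^2/2. Letting
\varepsilon tend to zero formally removes the second-order term and leaves a
deterministic differential game.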

For risk sensitive control on an infinite time horizon, logarithmic
transformations lead to stochastic differential games with an ergodic (average
cost per unit time) payoff criterion. The value of the stochastic differential
game is an optimal long-term growth rate of expected exponential cost, or
equivalently an optimal Donsker-Varadhan large deviations rate.

To  obtain  results  about  robust  nonlinear  control,  the  noise  intensity  for 
the  infinite  horizon  risk  sensitive  problem  was  made  to  tend  to  zero.  A  crucial 
question  turned  out  to  be  whether,  in  this  deterministic  limit,  the  optimal 
long-term  growth  rate  is  zero  or  positive.  If  it  is  zero,  then  a  dissipation 
inequality  which  plays  a  key  role  in  robust  nonlinear  control  theory  holds. 

In work with M. J. James, the dependence of the risk-sensitive index (i.e.,
the optimal long-term growth rate) on an additional small parameter was
examined. This parameter corresponds to the reciprocal of an operator norm
bound familiar in robust H-infinity control. Depending on the relative sizes
of this parameter and another parameter indicating noise, a mixture of H2
and H-infinity norms for nonlinear systems is obtained.

The  ideas  outlined  above  are  being  explored  via  numerical  experiments.  A 
risk  sensitive  control  formulation  of  a  model  for  semi-active  vehicle  suspension 
was  used  to  illustrate  the  method. 

Fleming  and  Soner  completed  a  research  monograph  on  “Controlled  Markov 
Processes  and  Viscosity  Solutions”. 

H. J. Kushner’s work covered many aspects of modern stochastic control.
Methods for stochastic approximation with averaging of the iterates
which yield optimal rates of convergence were developed. These results
greatly reduce the traditional difficulty of selecting good step size sequences.
They yield asymptotic results which are equivalent to what we would get if
we used the optimal matrix valued gain for the step sizes.
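
The idea of averaging the iterates can be sketched on a toy scalar problem.
The example below is only an illustration under stated assumptions: the
quadratic objective, the unit-variance Gaussian noise, the step size
c / n**gamma with gamma = 0.7, and the function names `averaged_sa` and
`noisy_gradient` are all hypothetical and do not come from the report.

```python
import random


def averaged_sa(noisy_gradient, theta0, n_steps, gamma=0.7, c=1.0, seed=0):
    """Stochastic approximation with a slowly decreasing step size,
    returning both the last iterate and the running average of the iterates."""
    rng = random.Random(seed)
    theta = theta0
    average = 0.0
    for n in range(1, n_steps + 1):
        step = c / n ** gamma          # decreases more slowly than 1/n
        theta = theta - step * noisy_gradient(theta, rng)
        average += (theta - average) / n   # running mean of theta_1..theta_n
    return theta, average


def noisy_gradient(theta, rng):
    # Assumed toy model: gradient of 0.5 * (theta - 2)^2 plus Gaussian noise.
    return (theta - 2.0) + rng.gauss(0.0, 1.0)


last_iterate, averaged_iterate = averaged_sa(noisy_gradient, theta0=10.0,
                                             n_steps=20000)
print(last_iterate, averaged_iterate)
```

In this sketch no gain matrix is tuned: the raw iterate uses a generic step
size, and the averaged iterate is typically the more accurate estimate of
the root at 2.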

A major accomplishment was the completion of the book on new and very
powerful numerical methods in stochastic control. It covers the great bulk
of the formulations of the continuous time problems which have appeared to
date, as well as newer and less well known formulations: reflecting boundary
problems (for example from the heavy traffic approximations), singular
controls, ergodic problems, etc. These methods are becoming the numerical
methods of choice for stochastic control problems in continuous time.
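
One standard way to discretize a continuous-time stochastic control problem
is a Markov chain approximation: the diffusion is replaced by a controlled
random walk on a grid whose local mean and variance match the diffusion,
and the resulting dynamic programming equation is solved by iteration. The
sketch below assumes that this is the kind of method meant here. The test
problem (minimum expected exit time from the interval (0, 1) for
dx = u dt + sigma dW with u in {-1, 0, 1}), the grid size, and the function
name `solve_min_exit_time` are illustrative assumptions, not taken from the
report.

```python
def solve_min_exit_time(n=50, sigma=1.0, controls=(-1.0, 0.0, 1.0),
                        tol=1e-9, max_sweeps=100000):
    """Approximate the minimal expected exit time from (0, 1) on a grid of
    n intervals using a locally consistent controlled Markov chain."""
    h = 1.0 / n
    # For each control value: interpolation interval and the probabilities
    # of stepping up or down one grid point (upwind treatment of the drift).
    chain = []
    for u in controls:
        q = sigma ** 2 + h * abs(u)
        dt = h * h / q
        p_up = (0.5 * sigma ** 2 + h * max(u, 0.0)) / q
        p_down = (0.5 * sigma ** 2 + h * max(-u, 0.0)) / q
        chain.append((dt, p_up, p_down))

    v = [0.0] * (n + 1)                # v[0] and v[n] stay 0 (exit states)
    for _ in range(max_sweeps):
        change = 0.0
        for i in range(1, n):
            # Running cost is 1 per unit time, so each step costs dt.
            best = min(dt + p_up * v[i + 1] + p_down * v[i - 1]
                       for dt, p_up, p_down in chain)
            change = max(change, abs(best - v[i]))
            v[i] = best                # in-place (Gauss-Seidel) update
        if change < tol:
            break
    return v


values = solve_min_exit_time()
print(values[25])                      # approximate value at x = 0.5
```

For this particular test problem with sigma = 1, a direct calculation with
the continuous-time dynamic programming equation gives the value 1/(2e),
about 0.184, at the midpoint, which offers a rough check on the grid
solution.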

We developed an effective method for the modelling, optimal control, and
numerical solution of large trunk line systems under heavy traffic. The
controls are rerouting decisions and pose serious nonstandard difficulties.
The optimal costs for the network are well approximated by optimal costs
for the heavy traffic limit.

Numerical solutions show that the methods do provide excellent
strategies in realistic situations. These results show the great power of
modern methods in stochastic control for the treatment of difficult and very
practical problems.

Large controlled multiplexing systems are approximated by diffusion type
processes. This yields a very efficient way of approximation as well as good
numerical methods. The “limit” equations are an efficient aggregation of
the original system, and they provide the basis of the good numerical
approximations for the control problem. The numerical approximations have
the structure of the original problem, but are generally much simpler. The
control can occur in a variety of places, for example at “leaky bucket”
type controllers or through control of “marked cells” at the transmitter
buffer. These are equivalent. Various forms of the optimal control problem
have been explored, where the aim was to control or balance the losses at
the control with those due to buffer overflow. These are typical of many
possibilities. The extensive numerical experiments show that much can be
saved via the use of optimal controls or reasonable approximations to them.
We discuss systems with several classes of sources, various aggregation
methods and control approximation schemes. The results show that the
approach is a very useful tool for providing both qualitative and
quantitative information on problems in ATM and broadband integrated data
networks. Such information would be hard to get otherwise, and this amply
demonstrates the power of modern techniques in stochastic control for the
effective treatment of problems of great interest and complexity.

We developed efficient and general stochastic approximation (SA) methods
for improving the operation of parametrized systems of either the continuous
or discrete event dynamical systems types. The emphasis was on systems
which operate over a very long time period. The number of applications is
increasing at a rapid rate. This is partly due to the increasing activity in
computing pathwise derivatives and adapting them to the average cost
problem. The powerful ODE type methods have been extended in a fairly
general context, based on weak convergence ideas. The results and proof
techniques are applicable to a wide variety of applications. Exploiting the
full potential of these ideas can greatly simplify and extend much current
work. The breadth and relative ease of using the basic ideas are illustrated
through typical examples from discrete event dynamical systems, piecewise
deterministic dynamical systems, and stochastic differential equations
models. Algorithms for distributed/asynchronous updating, as well as fully
synchronous schemes, were developed.